Multimedia System with Processing of Multimedia Data Streams

ABSTRACT

A media system is disclosed that records and/or stores images, video, and/or audio representing a scene in its field of view into a multimedia data stream. The media system extracts and/or frames one or more particular objects from the images, video, and/or audio of the multimedia data stream and/or from images, video, and/or audio of previously recorded multimedia data streams to provide a processed multimedia data stream. The media system plays back the images, video, and/or audio of the processed multimedia data stream.

CROSS-REFERENCE TO RELATED APPLICATIONS

The present application claims the benefit of U.S. Provisional Patent Appl. No. 61/549,495, filed Oct. 20, 2011, which is incorporated herein by reference in its entirety.

BACKGROUND

1. Field of Disclosure

The present disclosure relates generally to multimedia data streams, and more specifically to recording and playing back of processed multimedia data streams.

2. Related Art

A conventional media capture module, such as a digital camera to provide an example, can record and/or store a scene in its field of view. These conventional media capture modules can include extremely wide angle lenses, such as a pin hole lens and a fish eye lens to provide some examples, to capture images of the scene within a large field of view at very high resolutions. Often, conventional media playback devices, such as monitors, televisions, and mobile communications devices, such as smart phones or portable computers to provide some examples, are only capable of playing back the images of the scene at much lower resolutions. This lower resolution of the conventional media playback devices allows the high resolution images of the scene to be modified by, for example, zooming, cutting, and/or cropping, for play back without detrimentally affecting the quality of the images.

Conventionally, users of the conventional media capture module have to manually modify the high resolution images of the scene to select, extract, and/or merge one or more images from a multimedia data stream depicting the scene. As a result, participation in the scene by these users is rather limited. For example, a parent at a birthday party conventionally operates the conventional media capture module to capture images, video, and/or audio of the birthday party. As a result, this parent's participation in the birthday party is rather limited.

BRIEF DESCRIPTION OF THE DRAWINGS/FIGURES

Embodiments of the disclosure are described with reference to the accompanying drawings. In the drawings, like reference numbers indicate identical or functionally similar elements. Additionally, the left most digit(s) of a reference number identifies the drawing in which the reference number first appears.

FIG. 1 illustrates a block diagram of an exemplary remote processing media system according to an exemplary embodiment of the present disclosure;

FIG. 2 illustrates a block diagram of a second exemplary remote processing media system according to an exemplary embodiment of the present disclosure;

FIG. 3 illustrates a block diagram of a third exemplary remote processing media system according to an exemplary embodiment of the present disclosure;

FIG. 4 illustrates a block diagram of an exemplary local processing media system according to an exemplary embodiment of the present disclosure;

FIG. 5 illustrates a block diagram of a first media capture module that can be used within the exemplary video camera system according to an exemplary embodiment of the present disclosure; and

FIG. 6 illustrates a block diagram of a second media capture module that can be used within the exemplary video camera system according to an exemplary embodiment of the present disclosure.

The disclosure will now be described with reference to the accompanying drawings. In the drawings, like reference numbers generally indicate identical, functionally similar, and/or structurally similar elements. The drawing in which an element first appears is indicated by the leftmost digit(s) in the reference number.

DETAILED DESCRIPTION OF THE DISCLOSURE

The following Detailed Description refers to accompanying drawings to illustrate exemplary embodiments consistent with the disclosure. References in the Detailed Description to “one exemplary embodiment,” “an exemplary embodiment,” “an example exemplary embodiment,” etc., indicate that the exemplary embodiment described can include a particular feature, structure, or characteristic, but every exemplary embodiment may not necessarily include the particular feature, structure, or characteristic. Moreover, such phrases are not necessarily referring to the same exemplary embodiment. Further, when a particular feature, structure, or characteristic is described in connection with an exemplary embodiment, it is within the knowledge of those skilled in the relevant art(s) to effect such feature, structure, or characteristic in connection with other exemplary embodiments whether or not explicitly described.

The exemplary embodiments described herein are provided for illustrative purposes, and are not limiting. Other exemplary embodiments are possible, and modifications can be made to the exemplary embodiments within the spirit and scope of the disclosure. Therefore, the Detailed Description is not meant to limit the disclosure. Rather, the scope of the disclosure is defined only in accordance with the following claims and their equivalents.

Embodiments of the disclosure can be implemented in hardware, firmware, software, or any combination thereof. Embodiments of the disclosure can also be implemented as instructions stored on a machine-readable medium, which can be read and executed by one or more processors. A machine-readable medium can include any mechanism for storing or transmitting information in a form readable by a machine (e.g., a computing device). For example, a machine-readable medium can include non-transitory machine-readable media such as read only memory (ROM); random access memory (RAM); magnetic disk storage media; optical storage media; flash memory devices; and others. As another example, the machine-readable medium can include transitory machine-readable media such as electrical, optical, acoustical, or other forms of propagated signals (e.g., carrier waves, infrared signals, digital signals, etc.). Further, firmware, software, routines, and instructions can be described herein as performing certain actions. However, it should be appreciated that such descriptions are merely for convenience and that such actions in fact result from computing devices, processors, controllers, or other devices executing the firmware, software, routines, instructions, etc.

The following Detailed Description of the exemplary embodiments will so fully reveal the general nature of the disclosure that others can, by applying knowledge of those skilled in relevant art(s), readily modify and/or adapt for various applications such exemplary embodiments, without undue experimentation, without departing from the spirit and scope of the disclosure. Therefore, such adaptations and modifications are intended to be within the meaning and plurality of equivalents of the exemplary embodiments based upon the teaching and guidance presented herein. It is to be understood that the phraseology or terminology herein is for the purpose of description and not of limitation, such that the terminology or phraseology of the present specification is to be interpreted by those skilled in relevant art(s) in light of the teachings herein.

For purposes of this discussion, the term “module” shall be understood to include at least one of software, firmware, and hardware (such as one or more circuits, microchips, or devices, or any combination thereof), and any combination thereof. In addition, it will be understood that each module can include one, or more than one, component within an actual device, and each component that forms a part of the described module can function either cooperatively or independently of any other component forming a part of the module. Conversely, multiple modules described herein can represent a single component within an actual device. Further, components within a module can be in a single device or distributed among multiple devices in a wired or wireless manner.

Overview

The following Detailed Description describes various media systems that can record and/or store a scene in their field of view as a multimedia data stream. Typically, these various media systems capture the scene in resolutions that exceed resolutions that can be played back. As a result, the multimedia data stream can be processed by, for example, zooming, cutting, and/or cropping, for play back without detrimentally affecting the quality of the scene.
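By way of a non-limiting illustration, the following Python sketch shows one way a playback-sized window could be cropped out of a higher resolution captured frame without any upscaling. The names Rect and crop_window, and the capture and playback resolutions used in the usage note, are assumptions introduced for this example and do not appear elsewhere in the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """A crop window in pixel coordinates of the captured frame."""
    left: int
    top: int
    width: int
    height: int


def crop_window(frame_w, frame_h, out_w, out_h, center_x, center_y):
    """Return an out_w x out_h window centered as closely as possible on
    (center_x, center_y) while staying fully inside the captured frame."""
    if out_w > frame_w or out_h > frame_h:
        raise ValueError("playback resolution exceeds capture resolution")
    # Clamp the window so that it never extends past a frame edge.
    left = min(max(center_x - out_w // 2, 0), frame_w - out_w)
    top = min(max(center_y - out_h // 2, 0), frame_h - out_h)
    return Rect(left, top, out_w, out_h)
```

For instance, under the assumption of a 7680 by 4320 pixel capture and a 1920 by 1080 pixel playback device, crop_window(7680, 4320, 1920, 1080, 3840, 2160) selects a window at the center of the frame, and each played back pixel corresponds to exactly one captured pixel.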

The various media systems can automatically process the multimedia data streams depicting the scene locally or remotely through a communications network with minimal user assistance. The user of these media systems can identify one or more particular objects, such as one or more particular people, one or more particular animals, one or more particular objects, one or more particular scenes, one or more particular voices, and/or one or more particular backgrounds to provide some examples, from a scene in a field of view of the media systems. Thereafter, the media systems can automatically select, extract, and/or merge portions of a multimedia data stream depicting the scene that include the one or more particular objects to compile a new multimedia data stream as a processed multimedia data stream for playback. For example, users of the media systems can identify one or more people from the scene. In this example, the media systems select, extract, and/or merge portions of the multimedia data stream that include these people to provide the processed multimedia data stream. This effectively allows the media systems to track or to follow the people within the scene.

Additionally, the media systems can store previously recorded multimedia data streams of other scenes. These other scenes can be different scenes than depicted in the multimedia data stream or similar scenes as depicted in the multimedia data stream at different times to provide some examples. The user of these media systems can identify the one or more particular objects from these previously recorded multimedia data streams in a similar manner as identifying the one or more particular objects from the multimedia data stream. The media systems can automatically select, extract, and/or merge portions of the previously recorded multimedia data streams that include the one or more particular objects and compile these extracted and/or framed portions, as well as extracted and/or framed portions of the multimedia data stream, into a new multimedia data stream as a processed multimedia data stream for playback. For example, users of the media systems can identify one or more people from previously captured scenes. In this example, the media systems select, extract, and/or merge portions of the previously recorded multimedia data streams that include these people and compile these extracted and/or framed portions to provide the processed multimedia data stream.

First Exemplary Remote Processing Video Camera System

FIG. 1 illustrates a block diagram of an exemplary remote processing media system according to an exemplary embodiment of the present disclosure. An exemplary media system 100 locally records and/or stores images, video, and/or audio representing a scene in its field of view into a multimedia data stream. The exemplary media system 100 remotely extracts and/or frames one or more particular objects from the images, video, and/or audio of the multimedia data stream and/or from images, video, and/or audio of previously recorded multimedia data streams to provide a processed multimedia data stream for playback. The exemplary media system 100 includes a media capture module 102, a remote media processing system 104, a remote storage system 106, and a media playback device 108 that are communicatively coupled via a communication network 110.

The media capture module 102 records and/or stores a scene in its field of view as a multimedia data stream. Specifically, the media capture module 102 records and/or stores images, video, and/or audio representing a scene in its field of view into a multimedia data stream. The multimedia data stream can represent a still image of the scene at a particular instance in time, commonly referred to as an image, a series of the images over a duration in time that represents the scene in motion, commonly referred to as video, a representation of various sounds occurring within, or near, the scene, commonly referred to as audio, or any combination thereof. The media capture module 102 can represent a digital camera, a digital video camera, a mobile communications device, such as a smart phone or portable computer, that has an integrated camera, or any other electronic device that is capable of recording and/or storing images, video, and/or audio that will be apparent to those skilled in the relevant art(s) without departing from the spirit and scope of the present disclosure.

Typically, the media capture module 102 provides the multimedia data stream to the remote media processing system 104 for processing via the communication network 110. Optionally, the media capture module 102 can pre-process the multimedia data stream before providing it to the remote media processing system 104. This pre-processing can include various image processing techniques such as cropping, image straightening, red-eye effect removal, contrast adjustment, color adjustment, and image retouching to provide some examples. Additionally, the media capture module 102 can communicate with the remote media processing system 104 to assist the remote media processing system 104. The media capture module 102 can include an interface with the remote media processing system 104 to direct processing by the remote media processing system 104 in response to commands from the media capture module 102. These commands can be automatically generated by the media capture module 102 and/or be generated in response to a user of the media capture module 102. For example, the media capture module 102 can play back the multimedia data stream using the interface. A user of the media capture module 102 can provide user information to assist in identifying one or more particular objects, such as one or more particular people, one or more particular animals, one or more particular objects, one or more particular scenes, one or more particular voices, one or more particular backgrounds, or any other supplemental, known objects to provide some examples, from the multimedia data stream for selecting, extracting, and/or merging. The user of the media capture module 102 can touch, or make a gesture around, the one or more particular objects as the multimedia data stream is being played back. The media capture module 102 can send the commands to cause the remote media processing system 104 to select, to extract, and/or to merge these one or more particular objects from the multimedia data stream.
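By way of a non-limiting illustration, the following Python sketch shows one way a touch on a played back frame could be translated into a command identifying the touched objects. The disclosure does not specify a command format; the bounding box representation, the dictionary layout, and the names objects_at and command_from_touch are assumptions made for this example only.

```python
def objects_at(touch_x, touch_y, detections):
    """Return the labels of all detections whose bounding box contains the touch.

    detections is a list of (label, (left, top, width, height)) tuples in the
    pixel coordinates of the frame being played back (hypothetical format).
    """
    return [
        label
        for label, (left, top, width, height) in detections
        if left <= touch_x < left + width and top <= touch_y < top + height
    ]


def command_from_touch(touch_x, touch_y, detections, action="extract"):
    """Build a hypothetical command asking the remote media processing system
    to act on whatever objects lie under the touch point."""
    return {"action": action, "targets": objects_at(touch_x, touch_y, detections)}
```

A gesture around an object could be handled in a similar manner by testing the bounding boxes against the region enclosed by the gesture rather than against a single point.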

In some situations, the media capture module 102 can capture multiple independent multimedia data streams that each correspond to a scene from among a series of independent scenes. For example, a first user of the media capture module 102 can capture a first scene as a first independent multimedia data stream and the first user, or a second user, of the media capture module 102 can capture a second scene as a second independent multimedia data stream. Typically, each scene from among the series of independent scenes is independent of the others in location, duration, and/or time to provide some examples.

The remote media processing system 104 operates in conjunction with the remote storage system 106 to process multimedia data streams, such as the multimedia data stream and/or previously recorded multimedia data streams of other scenes that are stored within the remote storage system 106. These other scenes can be different scenes than depicted in the multimedia data stream or similar scenes as depicted in the multimedia data stream at different times to provide some examples. Typically, the media capture module 102 and/or the media playback device 108 provide commands to the remote media processing system 104 to identify the one or more particular objects from the multimedia data stream and/or from the previously recorded multimedia data streams. Additionally, the remote media processing system 104 can automatically identify the one or more particular objects from the multimedia data stream and/or from previously recorded multimedia data streams. For example, the user of the media capture module 102 and/or of the media playback device 108 can provide user information to assist in identifying the one or more particular objects from the multimedia data stream and/or the previously recorded multimedia data streams for selecting, extracting, and/or merging. This user information can include one or more pointers, such as one or more textual inputs, that can be used to identify the one or more particular objects. Next, the remote media processing system 104 can retrieve a portion of the previously recorded multimedia data streams from the remote storage system 106 that corresponds to the one or more pointers and select, extract, and/or merge the one or more particular objects from the multimedia data stream and/or the previously recorded multimedia data streams using the portion of the previously recorded multimedia data streams that corresponds to the one or more pointers.

Generally, the remote media processing system 104 identifies one or more objects within the multimedia data stream and/or the previously recorded multimedia data streams and recognizes the one or more objects as being the one or more particular objects using various image, video, and/or audio recognition techniques. Typically, these image, video, and/or audio recognition techniques compare a portion of previously recorded multimedia data streams that include image, video, and/or audio of various previously recognized objects that are stored within the remote storage system 106 with the one or more objects within the multimedia data stream and/or the previously recorded multimedia data streams to recognize the one or more particular objects. The previously recognized objects represent supplemental known information such as one or more previously recognized people, one or more previously recognized animals, one or more previously recognized objects, one or more previously recognized scenes, one or more previously recognized voices, and/or one or more previously recognized backgrounds to provide some examples. Also, the remote media processing system 104 can identify one or more objects that are common between the multimedia data stream and/or the previously recorded multimedia data streams and recognize these one or more common objects as being the one or more particular objects using various image, video, and/or audio recognition techniques.
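By way of a non-limiting illustration, the following Python sketch shows one possible way of comparing an object found in a multimedia data stream against previously recognized objects. The disclosure does not name a particular recognition technique; representing each object as a numeric feature vector, comparing vectors by cosine similarity, and the threshold value of 0.9 are all assumptions made for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def recognize(candidate, known, threshold=0.9):
    """Return the label of the best matching previously recognized object,
    or None when no stored object is similar enough.

    known maps a label to a stored feature vector (hypothetical format).
    """
    best_label, best_score = None, threshold
    for label, reference in known.items():
        score = cosine_similarity(candidate, reference)
        if score >= best_score:
            best_label, best_score = label, score
    return best_label
```

A result of None in this sketch corresponds to an object that has not been previously recognized, for which assistance can be requested from a user.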

Additionally, the remote media processing system 104 can request assistance with recognition of one or more objects from the media capture module 102 and/or the media playback device 108 when none of the one or more objects have been previously recognized. For example, when the multimedia data stream and/or the previously recorded multimedia data streams include one or more new objects that have not been previously recognized by the remote media processing system 104, the remote media processing system 104 can request assistance so that these new objects can be recognized as previously recognized objects in the future.

After identification of the one or more objects, the remote media processing system 104 can select, extract, and/or merge portions of the multimedia data stream and/or the previously recorded multimedia data streams that include the one or more particular objects from the images, video, and/or audio of the multimedia data stream and/or the previously recorded multimedia data streams and compile a new multimedia data stream that includes these portions as the processed multimedia data stream for playback by the media playback device 108. The remote media processing system 104 can select, extract, and/or merge the one or more particular objects from the images, video, and/or audio of the multimedia data stream and/or the previously recorded multimedia data streams as well as selecting, extracting, and/or merging other portions of the images, video, and/or audio of the multimedia data stream and/or the previously recorded multimedia data streams that surround the one or more particular objects.
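By way of a non-limiting illustration, the following Python sketch shows one way the portions of a multimedia data stream that include a particular object could be merged before being compiled into a processed multimedia data stream. Representing each portion as a (start, end) pair of times in seconds, and the gap parameter that joins portions separated by a short pause, are assumptions made for this example.

```python
def merge_segments(segments, gap=0.0):
    """Merge overlapping or nearly adjacent (start, end) time segments.

    Segments whose start lies within `gap` seconds of the end of the
    previous merged segment are joined into a single segment.
    """
    merged = []
    for start, end in sorted(segments):
        if merged and start <= merged[-1][1] + gap:
            # Extend the previous segment rather than starting a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged
```

In this sketch, the merged segments would then be extracted in order and concatenated to form the processed multimedia data stream; a nonzero gap also retains short intervals that surround the appearances of the particular object.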

The remote storage system 106 stores the multimedia data stream for future retrieval by the remote media processing system 104. Also, the remote storage system 106 stores previously recorded images, video, and/or audio of previously recorded multimedia data streams for recognition of the one or more particular objects by the remote media processing system 104. Typically, the remote storage system 106 stores portions of previously recorded multimedia data streams that include previously recognized objects which can be indexed by one or more pointers. The remote storage system 106 can receive one or more pointers from the remote media processing system 104 and provide the portions of previously recorded multimedia data streams that correspond to the one or more pointers. The remote storage system 106 can update the previously recorded images, video, and/or audio when new objects are recognized by the remote media processing system 104 within the multimedia data stream and/or the previously recorded multimedia data streams.
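By way of a non-limiting illustration, the following Python sketch shows one way stored portions of previously recorded multimedia data streams could be indexed by textual pointers. The class name PointerIndex, the use of string identifiers for stored portions, and the case-insensitive matching are assumptions made for this example.

```python
from collections import defaultdict


class PointerIndex:
    """Maps textual pointers to identifiers of stored stream portions."""

    def __init__(self):
        self._by_pointer = defaultdict(list)

    def add(self, pointer, portion_id):
        """Associate a stored portion with a pointer (case-insensitive)."""
        self._by_pointer[pointer.lower()].append(portion_id)

    def lookup(self, pointers):
        """Return the portions matching any of the pointers, in first-seen
        order and without duplicates."""
        seen, result = set(), []
        for pointer in pointers:
            for portion_id in self._by_pointer.get(pointer.lower(), []):
                if portion_id not in seen:
                    seen.add(portion_id)
                    result.append(portion_id)
        return result
```

When a new object is recognized, calling add with the text supplied by the user would make the corresponding portion retrievable by that pointer in the future.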

The media playback device 108 plays back the processed multimedia data stream from the remote media processing system 104. The media playback device 108 can display the images and/or the video and/or play back the audio within the processed multimedia data stream. Optionally, the media playback device 108 can post-process the processed multimedia data stream before play back. This post-processing can include various image processing techniques such as cropping, image straightening, red-eye effect removal, contrast adjustment, color adjustment, and image retouching to provide some examples. Additionally, the media playback device 108 can communicate with the remote media processing system 104 to assist the remote media processing system 104. The media playback device 108 can include an interface with the remote media processing system 104 to direct processing by the remote media processing system 104 in response to commands from the media playback device 108. For example, the media playback device 108 can play back the processed multimedia data stream using the interface. A user of the media playback device 108 can provide user information regarding the one or more particular objects from the processed multimedia data stream for selecting, extracting, and/or merging. For example, the user of the media playback device 108 can identify the one or more particular objects from the processed multimedia data stream for selecting, extracting, and/or merging. The user of the media playback device 108 can simply touch, or make a gesture around, the one or more particular objects as the processed multimedia data stream is being played back. The media playback device 108 can send the commands to cause the remote media processing system 104 to select, to extract, and/or to merge these one or more particular objects from the processed multimedia data stream. The media playback device 108 can include a monitor, a television, a mobile communications device, such as a smart phone or portable computer, or any other electronic device that is capable of playing back the processed multimedia data stream that will be apparent to those skilled in the relevant art(s) without departing from the spirit and scope of the present disclosure.

The communication network 110 communicatively couples the media capture module 102, the remote media processing system 104, the remote storage system 106, and the media playback device 108. The communication network 110 can include any suitable wireless communication network, such as a cellular network to provide an example, any suitable wired communication network, such as a fiber optic network or cable network to provide some examples, or any combination of wireless and wired communication networks.

Although the description of FIG. 1 illustrates a single media capture module 102 and/or a single media playback device 108, those skilled in the relevant art(s) will recognize that the remote media processing system 104 and the remote storage system 106 can service multiple media capture modules 102 and/or multiple media playback devices 108 without departing from the spirit and scope of the present disclosure. These multiple media capture modules 102 and/or multiple media playback devices 108 can provide multimedia data streams and/or commands from multiple users of these devices to the remote media processing system 104 for processing in a substantially similar manner as described above. Additionally, those skilled in the relevant art(s) will recognize that the media capture module 102 and the media playback device 108 can be integrated onto a single platform, such as a mobile communications device, a personal computer, a laptop computer, or any other integrated platform without departing from the spirit and scope of the present disclosure.

Moreover, aspects of the present disclosure can be integrated within conventional platforms so as to provide enhanced services. For example, the remote media processing system 104 can be integrated within a web search service platform, such as those provided by Google™ or Bing™ to provide some examples. A first multimedia data stream, gathered by multiple media capture modules, such as multiple media capture modules 102, of each of a first plurality of users, is uploaded to the search engine platform for secure, encrypted storage in return for a service fee. A second multimedia data stream, gathered by such multiple media capture modules, is stored locally, also with secure encryption. The remote media processing system 104 operates within the web search engine services platform by performing image, audio, and video recognition and constructing a reverse indexed recognition database wherein textual descriptions are associated with recognized objects, sounds, speakers, persons, buildings, scenes, animals, and so on. Such recognition approaches can be shared with those used for image, audio, and video search services. That is, search platforms currently support searching of a plurality of web located images using an uploaded image. This can be extended to searching amongst image frames in video and to searching audio using an uploaded audio segment. Thus, by using pre-recognized audio, image, and video elements, other audio, video, and image elements can be found in other media. For example, known and textually identified items such as the Eiffel Tower can be recognized currently in an uploaded image. By processing all uploaded media in such a manner, various common content can be identified within the first multimedia data stream. Similarly, by recognizing similarities within the uploaded first multimedia data stream itself, popular elements can be grouped and, with user assistance, fully identified via prompting for associated text. Other metadata, such as time of day, date, latitude, and longitude to provide some examples, can also be gathered to support searching.

Locally, perhaps within the multiple media capture modules, further media recognition processing, although not needed, can be added. If employed, such processing may operate as a substitute for the remote media processing system 104. For example, all media recognition could be performed on the second multimedia data stream locally. Alternatively, local processing could operate cooperatively with the remote media processing system 104. For example, the remote media processing system 104 can identify characteristics of common elements within a given user's media while the media capture module 102 can prompt interaction from the user to gather textual and verbal descriptions of those common elements. In addition, the media capture module 102 can store recognition data associated with each such element, and use the stored recognition data to locally process new multimedia data streams.

Second Exemplary Remote Processing Video Camera System

FIG. 2 illustrates a block diagram of a second exemplary remote processing media system according to an exemplary embodiment of the present disclosure. An exemplary media system 200 locally records and/or stores images, video, and/or audio representing a scene in its field of view into a multimedia data stream. The exemplary media system 200 remotely extracts and/or frames one or more particular objects from the images, video, and/or audio of the multimedia data stream and/or from previously recorded multimedia data streams to provide a processed multimedia data stream for playback. The exemplary media system 200 includes a media capture module 261, a remote media processing system 202, a cloud storage system 223, and a media playback device 271 that are communicatively coupled via the communication network 110. The media capture module 261, the remote media processing system 202, the cloud storage system 223, and the media playback device 271 can represent an exemplary embodiment of the media capture module 102, the remote media processing system 104, the remote storage system 106, and the media playback device 108, respectively.

The media capture module 261 records and/or stores a scene in its field of view as a multimedia data stream. Additionally, the media capture module 261 can communicate with the remote media processing system 202 to assist the remote media processing system 202. The media capture module 261 includes an upload/security support module 263, an imager(s) and microphone(s) module 265, a media pre-processing module 267, a local storage module 268, and a cloud interface module 269.

The imager(s) and microphone(s) module 265 can record a still image of the scene at a particular instance in time, commonly referred to as an image, a series of the images over a duration in time that represents the scene in motion, commonly referred to as video, a representation of various sounds occurring within, or near, the scene, commonly referred to as audio, or any combination thereof to provide the multimedia data stream. In some embodiments, the imager(s) and microphone(s) module 265 includes a wide angle or panoramic lens, such as pin hole and/or fish eye to provide some examples, for capturing the images and/or the video within the scene. The wide angle lens refers to a lens, or series of lenses, whose effective focal length is substantially smaller than a focal length of a non-wide angle lens for a given film plane.
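By way of a non-limiting illustration, the following Python sketch shows why a smaller effective focal length yields a wider field of view for a given film plane. It uses the standard angle of view relation for an ideal rectilinear lens focused at infinity, namely two times the arctangent of the film plane dimension divided by twice the focal length. This relation is an assumption for illustration only; fish eye lenses follow other projections and are not modeled here, and the function name and numeric values are hypothetical.

```python
import math


def angle_of_view_degrees(film_dimension_mm, focal_length_mm):
    """Angle of view of an ideal rectilinear lens focused at infinity,
    measured along one dimension of the film plane."""
    if film_dimension_mm <= 0 or focal_length_mm <= 0:
        raise ValueError("dimensions must be positive")
    return math.degrees(2.0 * math.atan(film_dimension_mm / (2.0 * focal_length_mm)))
```

Under these assumptions, a film plane that is 36 mm wide paired with an 18 mm focal length gives an angle of view of about 90 degrees, while the same film plane paired with a longer focal length gives a narrower angle.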

The local storage module 268 stores the images, the video, and/or the audio of the multimedia data stream. The local storage module 268 can include random access memory (RAM), magnetic disk storage media, optical storage media, flash memory devices, or any other suitable electrical, mechanical, or electromechanical device that is capable of storing the multimedia data stream.

The media pre-processing module 267 pre-processes the multimedia data stream before providing it to the remote media processing system 202. This pre-processing can include various image processing techniques such as cropping, image straightening, red-eye effect removal, contrast adjustment, color adjustment, and image retouching to provide some examples. Typically, the media pre-processing module 267 can retrieve the multimedia data stream from the local storage module 268 for pre-processing or can pre-process the multimedia data stream before it is stored by the local storage module 268. The media pre-processing module 267 can cause a pre-processed multimedia data stream to be stored in the local storage module 268 along with, or in lieu of, the multimedia data stream.

Additionally, the media capture module 261 communicates with the remote media processing system 202 to assist the remote media processing system 202. The cloud interface module 269 provides an interface with the remote media processing system 202 to direct processing by the remote media processing system 202 in response to commands from the media capture module 261. These commands can be automatically generated by the media capture module 261 and/or be generated in response to a user of the media capture module 261. For example, the user of the media capture module 261 can identify one or more particular objects, such as one or more particular people, one or more particular animals, one or more particular objects, one or more particular scenes, one or more particular voices, and/or one or more particular backgrounds to provide some examples, from the multimedia data stream for selecting, extracting, and/or merging using the cloud interface module 269. The user of the media capture module 261 can touch, or make a gesture around, the one or more particular objects as the multimedia data stream is being played back. The cloud interface module 269 can send the commands to cause the remote media processing system 202 to select, extract, and/or merge these one or more particular objects from the multimedia data stream. The commands can cause the remote media processing system 202 to identify and/or recognize the one or more particular objects from previously recorded multimedia data streams of other scenes that are stored within the cloud storage system 223. These other scenes can be different scenes than depicted in the multimedia data stream or similar scenes as depicted in the multimedia data stream at different times to provide some examples.

The commands can also identify various processing parameters of the processing to be performed by the remote media processing system 202. These various processing parameters can include a preferred output type, such as image, video, and/or audio to provide some examples, to be played back by the media playback device 271, a length, such as in time or in bytes to provide some examples, of the multimedia data streams to be played back by the media playback device 271, and one or more instances, or a range of instances, for which the previously recorded multimedia data stream was recorded.
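
To provide an example, the commands and the processing parameters described above could be represented as a simple data structure. The following Python sketch is illustrative only: the names ProcessingCommand and OutputType, and every field name, are hypothetical and are not part of the disclosure, which does not specify a command format.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import List, Optional, Tuple


class OutputType(Enum):
    """Preferred output types named in the description."""
    IMAGE = "image"
    VIDEO = "video"
    AUDIO = "audio"


@dataclass
class ProcessingCommand:
    """Hypothetical command sent to a remote media processing system."""
    object_ids: List[str]                  # particular objects the user identified
    output_types: List[OutputType]         # preferred output type(s)
    max_seconds: Optional[float] = None    # length limit expressed in time
    max_bytes: Optional[int] = None        # length limit expressed in bytes
    # Range of instances in time over which earlier streams were recorded.
    recorded_between: Optional[Tuple[datetime, datetime]] = None

    def validate(self) -> None:
        """Raise ValueError if the command is not internally consistent."""
        if not self.object_ids:
            raise ValueError("at least one particular object must be identified")
        if not self.output_types:
            raise ValueError("at least one output type must be requested")
        if self.max_seconds is not None and self.max_seconds <= 0:
            raise ValueError("max_seconds must be positive")
        if self.max_bytes is not None and self.max_bytes <= 0:
            raise ValueError("max_bytes must be positive")
        if self.recorded_between is not None:
            start, end = self.recorded_between
            if start > end:
                raise ValueError("recorded_between start must not follow end")
```

A command built this way can be validated before it is sent, so that an empty object list or an inverted time range is rejected on the capture side rather than by the remote system.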

The upload/security support module 263 operates in conjunction with the cloud interface module 269 to establish a secure connection, such as a secure interface through a web browser to provide an example, to the remote media processing system 202. The media capture module 261 can then securely, through authentication and/or authorization to provide some examples, provide the multimedia data stream and/or the pre-processed multimedia data stream from the local storage module 268 as well as the commands from the cloud interface module 269 and/or other commands from other modules within the exemplary media system 200 to the remote media processing system 202 via this secure connection.

The remote media processing system 202 operates in conjunction with the cloud storage system 223 to process multimedia data streams, such as the multimedia data stream and/or the previously recorded multimedia data streams, to provide the processed multimedia data stream. The remote media processing system 202 includes a cloud service processing module 251 and a cloud support processing module 241. The cloud service processing module 251 provides an interface between the media capture module 261 and the cloud support processing module 241. The cloud service processing module 251 directs the cloud support processing module 241 to process the multimedia data stream and/or the previously recorded multimedia data streams in response to commands from the media capture module 261 and/or the media player 273. The cloud service processing module 251 includes a user upload/download services module 253, a reference query interaction module 255, an extracting servicing module 257, and a sign in—account service module 259.

The user upload/download services module 253 operates in conjunction with the media capture module 261 to receive the multimedia data stream via the communication network 110, commonly referred to as upload, and in conjunction with the media playback device 271 to provide the processed multimedia data stream to the communication network 110, commonly referred to as download. The uploading of the multimedia data stream and/or the downloading of the processed multimedia data stream can occur in real-time or in non-real time, namely at some point in the future.

The extracting servicing module 257 directs the cloud support processing module 241 to recognize the one or more particular objects from the multimedia data stream and/or from the previously recorded multimedia data streams in response to the commands from the media capture module 261 and to provide the processed multimedia data stream. Typically, the extracting servicing module 257 provides one or more pointers corresponding to the one or more particular objects to the cloud support processing module 241 in response to the commands from the media capture module 261. Additionally, the extracting servicing module 257 provides the various processing parameters to direct the processing to be performed on the multimedia data stream and/or the previously recorded multimedia data streams by the cloud support processing module 241.

The reference query interaction module 255 operates in conjunction with the media capture module 261 and/or the media playback device 271 to assist the cloud support processing module 241 in recognizing one or more newly discovered objects, such as any newly discovered images, video, and/or audio to provide some examples, within the multimedia data stream and/or the previously recorded multimedia data streams. Typically, the reference query interaction module 255 receives a response from the cloud support processing module 241 that one or more objects within the multimedia data stream and/or the previously recorded multimedia data streams have not been previously recognized. In this situation, the reference query interaction module 255 operates in conjunction with the media capture module 261 and/or the media playback device 271 to identify these not previously recognized objects and, optionally, to update the previously recognized objects stored in the cloud storage system 223 to include these previously unrecognized objects. Typically, the reference query interaction module 255 can receive a portion of previously recorded multimedia data streams that include image, video, and/or audio of the previously unrecognized objects. The reference query interaction module 255 can provide this portion to the media capture module 261 and/or the media player 273 for recognition.

The sign in—account service module 259 establishes a secure connection, such as the secure interface through a web browser to provide an example, to the media capture module 261. Typically, the sign in—account service module 259 can then securely, after authentication and/or authorization, receive the multimedia data stream and/or the previously recorded multimedia data stream from the local storage module 268 as well as the commands from the cloud interface module 269 and/or other commands from other modules within the exemplary media system 200 via this secure connection.

The cloud support processing module 241 processes the multimedia data stream and/or the previously recorded multimedia data streams in response to commands from the cloud service processing module 251 to provide the processed multimedia data stream. The cloud support processing module 241 includes a recognition module 242 and an extraction module 243. The recognition module 242 identifies one or more objects within the multimedia data stream and/or the previously recorded multimedia data streams and recognizes the one or more objects as being the one or more particular objects using various image, video, and/or audio recognition techniques. These image, video, and/or audio recognition techniques can include video based, audio based, and/or face, person, object, and scene based recognition. Typically, these image, video, and/or audio recognition techniques compare a portion of previously recorded multimedia data streams that include image, video, and/or audio of various previously recognized objects that are stored within the cloud storage system 223 with the one or more objects within the multimedia data stream and/or the previously recorded multimedia data streams to recognize the one or more particular objects. The recognition module 242 can request assistance from the cloud service processing module 251 for recognition of newly discovered objects, namely those one or more objects that do not match the various previously recognized objects. For example, when the multimedia data stream and/or the previously recorded multimedia data streams include one or more new objects that have not been previously recognized by the recognition module 242, the recognition module 242 can request assistance from the cloud service processing module 251 to recognize these new objects as previously recognized objects in the future.
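
To provide an example, the comparison of detected objects against previously recognized objects, with unmatched objects set aside for assistance, could be sketched as follows. This Python sketch rests on assumptions that are not in the disclosure: each object is assumed to be already reduced to a numeric feature vector, cosine similarity is used as the comparison, and the threshold of 0.8 is arbitrary. The function names are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recognize(
    detected: Dict[str, Sequence[float]],
    references: Dict[str, Sequence[float]],
    threshold: float = 0.8,  # assumed value, not from the disclosure
) -> Tuple[Dict[str, str], List[str]]:
    """Match detected objects to previously recognized objects.

    Returns (matched, unrecognized): matched maps a detection id to the
    label of its best reference; unrecognized lists detection ids for
    which no reference reached the threshold, i.e. newly discovered
    objects for which assistance would be requested.
    """
    matched: Dict[str, str] = {}
    unrecognized: List[str] = []
    for det_id, features in detected.items():
        best_label, best_score = None, threshold
        for label, ref_features in references.items():
            score = cosine_similarity(features, ref_features)
            if score >= best_score:
                best_label, best_score = label, score
        if best_label is None:
            unrecognized.append(det_id)
        else:
            matched[det_id] = best_label
    return matched, unrecognized
```

In this sketch, the unrecognized list plays the role of the newly discovered objects: it is what a caller would forward for identification by a user before adding the result to the stored references.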

The extraction module 243 extracts and/or frames portions of the multimedia data stream and/or the previously recorded multimedia data streams that include the one or more particular objects, as recognized by the recognition module 242, from the images, video, and/or audio of the multimedia data stream and/or the previously recorded multimedia data streams and compiles a new multimedia data stream that includes these portions as the processed multimedia data stream. The extraction module 243 can select, extract, and/or merge the one or more particular objects from the images, video, and/or audio of the multimedia data stream and/or the previously recorded multimedia data streams as well as select, extract, and/or merge other portions of the images, video, and/or audio of the multimedia data stream and/or the previously recorded multimedia data streams that surround the one or more particular objects. Optionally, the extraction module 243 can transform the one or more particular objects from a two dimensional (2D) representation for viewing as a three dimensional (3D) representation to enable playback in 3D. The extraction module 243 can also perform various audio and/or video encoding, decoding, and/or transcoding on the multimedia data stream and/or the previously recorded multimedia data streams.
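
To provide an example, selecting the portions of a stream that include the particular objects, keeping some surrounding material, and merging the result could be sketched as interval arithmetic over time. The Python sketch below is an assumption-laden illustration: it treats each recognition result as a hypothetical (start, end, label) tuple in seconds, and the padding parameter stands in for the surrounding portions. It does not touch actual image, video, or audio data.

```python
from typing import Iterable, List, Set, Tuple

Interval = Tuple[float, float]


def select_intervals(
    detections: Iterable[Tuple[float, float, str]],
    wanted: Set[str],
    padding: float = 0.0,
) -> List[Interval]:
    """Return merged time intervals that contain any wanted object.

    detections: (start_s, end_s, label) tuples from a recognition step.
    wanted: labels of the particular objects to extract.
    padding: seconds of surrounding material kept on each side.
    """
    # Keep only the wanted objects, widen each interval, and sort by start.
    picked = sorted(
        (max(0.0, start - padding), end + padding)
        for start, end, label in detections
        if label in wanted
    )
    # Merge intervals that overlap or touch so each portion appears once.
    merged: List[Interval] = []
    for start, end in picked:
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged
```

The merged intervals could then be handed to whatever encoder or transcoder compiles the new stream; that step is outside the scope of this sketch.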

The cloud storage system 223 stores multimedia data streams and previously recognized objects of one or more users in a corresponding user data module from among one or more user data modules 215. The cloud storage system 223 stores the multimedia data stream and/or the previously recorded multimedia data streams in a compressed/raw media module 221 for later retrieval by the remote media processing system 202. The compressed/raw media module 221 can store these data streams in their raw, or uncompressed, form in a collected data module or can optionally compress these data streams before their storage in the collected data module. The compressed/raw media module 221 can optionally include an associated meta data module 223 to collect various metadata about the multimedia data stream as well as previously recorded data streams. This metadata can include an identification of the author, or user, who recorded the multimedia data stream and/or the previously recorded multimedia data streams as well as an identification of the media capture module 261, such as type of device, name of device, and manufacturer of device to provide some examples, that recorded these data streams. Additionally, the metadata can include various parameters, such as the standard or protocol used to record these data streams, the resolution of these data streams, the date of recording of these data streams, and the time of recording of these data streams to provide some examples, to assist in identification of the multimedia data stream and the previously recorded multimedia data streams for their later retrieval.
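
To provide an example, the metadata described above could be held as one record per stored data stream and filtered at retrieval time. The following Python sketch is illustrative only; StreamMetadata, find_streams, and all field names are hypothetical, and the disclosure does not prescribe a storage layout.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class StreamMetadata:
    """Hypothetical metadata record for one stored multimedia data stream."""
    stream_id: str
    author: str            # user who recorded the stream
    device_type: str       # identification of the capturing device
    device_name: str
    manufacturer: str
    standard: str          # standard or protocol used to record
    resolution: str
    recorded_at: datetime  # date and time of recording


def find_streams(
    records: Iterable[StreamMetadata],
    author: Optional[str] = None,
    start: Optional[datetime] = None,
    end: Optional[datetime] = None,
) -> List[str]:
    """Return ids of streams matching the author and recording-time range."""
    hits: List[str] = []
    for rec in records:
        if author is not None and rec.author != author:
            continue
        if start is not None and rec.recorded_at < start:
            continue
        if end is not None and rec.recorded_at > end:
            continue
        hits.append(rec.stream_id)
    return hits
```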

Also, the cloud storage system 223 stores previously recognized objects, such as one or more previously recognized people, one or more previously recognized animals, one or more previously recognized objects, one or more previously recognized scenes, one or more previously recognized voices, and/or one or more previously recognized backgrounds to provide some examples, in a preprocessed matched information module 233, a reference sounds/pointers module 224, a reference image/pointer module 227, and a characterization data module 231.

The preprocessed matched information module 233 stores portions of previously recorded multimedia data streams that include previously recognized objects which were previously identified and/or recognized by the recognition module 242. The preprocessed matched information module 233 can index the portions of previously recorded multimedia data streams using various pointers. The preprocessed matched information module 233 can provide these portions of previously recorded multimedia data streams to the remote media processing system 202.
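
To provide an example, indexing portions of previously recorded multimedia data streams by pointer could be sketched as a mapping from a recognized object to the locations where it appears. The Python sketch below is a hypothetical illustration: the disclosure does not define what a pointer contains, so a stream identifier plus a start and end time is assumed here, and the class and field names are invented for the example.

```python
from collections import defaultdict
from typing import DefaultDict, List, NamedTuple


class PortionPointer(NamedTuple):
    """Assumed pointer layout: which stream, and where in it."""
    stream_id: str
    start_s: float
    end_s: float


class MatchedPortionIndex:
    """Maps a previously recognized object to the portions that contain it."""

    def __init__(self) -> None:
        self._by_object: DefaultDict[str, List[PortionPointer]] = defaultdict(list)

    def add(self, object_label: str, pointer: PortionPointer) -> None:
        """Record that object_label appears in the portion at pointer."""
        self._by_object[object_label].append(pointer)

    def lookup(self, object_label: str) -> List[PortionPointer]:
        """Return a copy of all pointers for object_label (empty if none)."""
        return list(self._by_object.get(object_label, []))
```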

The reference sounds/pointers module 224 stores various multimedia data streams, or portions thereof, that include various well-known audio. This well-known audio can include well-known voices, well-known backgrounds, and/or well-known animals to provide some examples.

The reference image/pointer module 227 stores various multimedia data streams, or portions thereof, that include well-known images, and/or video, that are shared between the one or more user data modules 215. These well-known images, and/or video can include well-known persons, well-known objects, and/or well-known scenes to provide some examples.

The characterization data module 231 stores various multimedia data streams, or portions thereof, that include various reference images, video, and/or audio to assist the remote media processing system 202 in recognizing the one or more particular objects from the multimedia data stream and/or the previously recorded multimedia data streams. These reference images, video, and/or audio can include various historical variations that indicate changes to the one or more particular objects that can occur over time. For example, the historical variations can indicate how these reference images, video, and/or audio can appear in the past or the future.

The media playback device 271 plays back the processed multimedia data stream from the remote media processing system 202. Additionally, the media playback device 271 can communicate with the remote media processing system 202 to assist the remote media processing system 202. The media playback device 271 includes a media player 273, an enhanced extraction user interface 275, and a security/access user interface 277.

The media player 273 can display the images and/or the video and/or play back the audio within the processed multimedia data stream. The media player 273 can include a monitor, a television, a mobile communications device, such as a smart phone or portable computer, or any other electronic device that is capable of playing back the processed multimedia data stream that will be apparent to those skilled in the relevant art(s) without departing from the spirit and scope of the present disclosure. The media player 273 includes a real time extracting download interface to provide commands to the enhanced extraction user interface 275 to direct processing.

The enhanced extraction user interface 275 can communicate with the remote media processing system 202 to assist the remote media processing system 202. The enhanced extraction user interface 275 includes an extraction element identification module and a recognition assist module. The extraction element identification module and the recognition assist module generate the commands in response to the real time extracting download interface to direct processing by the remote media processing system 202. These commands can be automatically generated by the enhanced extraction user interface 275 and/or be generated in response to a user of the media player 273. For example, the media player 273 can play back the processed multimedia data stream using the interface. A user of the media playback device 271 can identify the one or more particular objects from the processed multimedia data stream for selecting, extracting, and/or merging by the remote media processing system 202. The user of the media playback device 271 can simply touch, or make a gesture around, the one or more particular objects as the processed multimedia data stream is being played back. The extraction element identification module and the recognition assist module can send the commands to cause the remote media processing system 202 to select, extract, and/or merge these one or more particular objects from the processed multimedia data stream. The commands can also identify various processing parameters of the processing to be performed by the remote media processing system 202. These various processing parameters can include a preferred output type, such as image, video, and/or audio to provide some examples, to be played back by the media playback device 271, a length, such as in time or in bytes to provide some examples, of the multimedia data streams to be played back by the media playback device 271, and one or more instances, or a range of instances, for which the previously recorded multimedia data stream was recorded.

The security/access user interface 277 establishes a secure connection, such as a secure interface through a web browser to provide an example, to the remote media processing system 202. The media player 273 can then securely, through authentication and/or authorization to provide some examples, receive the processed multimedia data stream from the remote media processing system 202 and/or provide the commands to the remote media processing system 202 via this secure connection.

Third Exemplary Remote Processing Video Camera System

FIG. 3 illustrates a block diagram of a third exemplary remote processing media system according to an exemplary embodiment of the present disclosure. An exemplary media system 300 locally records and/or stores images, video, and/or audio representing a scene in its field of view into a multimedia data stream. The exemplary media system 300 remotely extracts and/or frames one or more particular objects from the images, video, and/or audio of the multimedia data stream and/or from images, video, and/or audio of previously recorded multimedia data streams to provide a processed multimedia data stream for playback. The exemplary media system 300 includes a media capture module 302, a remote media processing and storage system 304, and a media playback device 306 that are communicatively coupled via a communication network 308. The exemplary media system 300 can represent an exemplary embodiment of the exemplary media system 100 and/or of the exemplary media system 200.

The media capture module 302 represents a mobile communications device for recording and/or storing images, video, and/or audio representing a scene in its field of view into a multimedia data stream. The media capture module 302 performs substantially similar functions as discussed above in regard to the media capture module 102 and/or the media capture module 261 and will not be described in further detail.

The remote media processing and storage system 304 processes multimedia data streams, such as the multimedia data stream and/or previously recorded multimedia data streams to provide some examples, to provide a processed multimedia data stream. The remote media processing and storage system 304 performs substantially similar functions as discussed above in regard to the remote media processing system 104, the remote storage system 106, the remote media processing system 202, and/or the cloud storage system 223 and will not be described in further detail.

The media playback device 306 plays back the processed multimedia data stream from the remote media processing and storage system 304. The media playback device 306 performs substantially similar functions as discussed above in regard to the media playback device 108 and/or the media playback device 271 and will not be described in further detail.

The communication network 308 communicatively couples the media capture module 302, the remote media processing and storage system 304, and the media playback device 306. The communication network 308 can include a first communication pathway 350 for communications between the media capture module 302 and the remote media processing and storage system 304, a second communication pathway 352 for communications between the remote media processing and storage system 304 and the media playback device 306, and a third communication pathway 354 for communications between the media capture module 302 and the media playback device 306. The first communication pathway 350, the second communication pathway 352, and/or the third communication pathway 354 can be part of any suitable wireless communication network, such as a cellular network to provide an example, any suitable wired communication network, such as a fiber optic network or cable network to provide some examples, or any combination of wireless and wired communication networks.

First Exemplary Local Processing Video Camera System

FIG. 4 illustrates a block diagram of an exemplary local processing media system according to an exemplary embodiment of the present disclosure. An exemplary media system 400 locally records and/or stores images, video, and/or audio representing a scene in its field of view into a multimedia data stream. The exemplary media system 400 locally extracts and/or frames one or more particular objects from the images, video, and/or audio of the multimedia data stream and/or previously recorded multimedia data streams to provide a processed multimedia data stream for playback. The exemplary media system 400 includes a media capture module 402, a local media processing system 404, a local storage module 406, and a media playback device 408. Typically, the media capture module 402, the local media processing system 404, the local storage module 406, and the media playback device 408 are integrated onto a single platform.

The media capture module 402 records and/or stores a scene in its field of view as a multimedia data stream. The media capture module 402 includes imager(s) and microphone(s) module 410 to record the images, the video, and/or the audio representing the scene in its field of view into the multimedia data stream. The imager(s) and microphone(s) module 410 provides the multimedia data stream in a substantially similar manner as the imager(s) and microphone(s) module 265 and will not be described in further detail.

The local media processing system 404 operates in conjunction with the local storage module 406 to process multimedia data streams and/or the previously recorded multimedia data streams that are stored in the local storage module 406. The local media processing system 404 includes a recognition module 410 and an extraction module 412. The recognition module 410 and the extraction module 412 operate in a substantially similar manner as the recognition module 242 and the extraction module 243, respectively, and will not be described in further detail.

The recognition module 410 identifies one or more objects within the multimedia data stream and/or previously recorded multimedia data streams and recognizes the one or more objects as being the one or more particular objects using various audio and/or video recognition techniques. These audio and/or video recognition techniques can include video based, audio based, and/or face, person, object, and scene based recognition. Typically, these audio and/or video recognition techniques compare various previously recognized objects stored in the local storage module 406 with the one or more objects within the multimedia data stream to recognize the one or more particular objects.

The extraction module 412 extracts and/or frames the one or more particular objects from the multimedia data stream and/or the previously recorded multimedia data streams to provide the processed multimedia data stream. Optionally, a 2D-3D Viewpoint module 414 can transform the one or more particular objects from a two dimensional (2D) representation for viewing as a three dimensional (3D) representation to enable playback in 3D.

The local storage module 406 stores multimedia data streams and previously recognized objects of one or more users in a substantially similar manner as the cloud storage system 223.

The media playback device 408 plays back the processed multimedia data stream from the local media processing system 404. Additionally, the media playback device 408 can communicate with the local media processing system 404 to assist the local media processing system 404. The media playback device 408 includes a media player 418 and an enhanced extraction user interface 420. The media player 418 and the enhanced extraction user interface 420 operate in a substantially similar manner as the media player 273 and the enhanced extraction user interface 275, respectively, and will not be described in further detail.

Exemplary Media Capture Modules

FIG. 5 illustrates a block diagram of a first media capture module that can be used within the exemplary video camera system according to an exemplary embodiment of the present disclosure. A media capture module 500 represents a stationary media capture module for recording and/or storing images, video, and/or audio representing a scene in its field of view into a multimedia data stream. The media capture module 500 includes a media recording module 502 and a stationary mount 504. The media capture module 500 can represent an exemplary embodiment of the media capture module 102, the media capture module 261, and/or the exemplary media system 400.

The media recording module 502 includes one or more media recording devices 506.1 through 506.i and a processing module 508. The media recording devices 506.1 through 506.i record images, video, and/or audio representing a scene in their fields of view into the multimedia data stream. Typically, the media recording module 502 includes a sufficient number of the media recording devices 506.1 through 506.i to capture a wide angle or panoramic view of the scene; however, those skilled in the relevant art(s) will recognize that any suitable number of the media recording devices 506.1 through 506.i can be used without departing from the spirit and scope of the present disclosure. The media recording devices 506.1 through 506.i are substantially similar to one another, but can include a different number of audio capture devices, illumination devices, and image capturing devices; therefore, only the media recording device 506.1 is described in further detail below.

The media recording device 506.1 records images, video, and/or audio representing the scene in its field of view. The media recording device 506.1 includes one or more illumination devices 510.1 through 510.k, one or more audio capture devices 512.1 through 512.m, and an image capture device 514. The one or more illumination devices 510.1 through 510.k illuminate the scene. This illumination can be characterized as being of a relatively short duration, typically 1/1000 to 1/200 of a second, commonly referred to as a flash, or it can be characterized as being of a longer duration, and can include any suitable portion, or portions, of the electromagnetic spectrum.

The one or more audio capture devices 512.1 through 512.m can capture a representation of various sounds occurring within, or near, the scene. The one or more audio capture devices 512.1 through 512.m are typically implemented using one or more microphones, although those skilled in the relevant art(s) will recognize that any electrical, mechanical, or electro-mechanical device that can convert sound into an electrical signal can be used without departing from the spirit and scope of the present disclosure. Typically, the one or more audio capture devices 512.1 through 512.m include at least two audio capture devices to capture the various sounds occurring within, or near, the scene as stereophonic sound or, more commonly, stereo, although those skilled in the relevant art(s) will recognize that any suitable number of audio capture devices can be used without departing from the spirit and scope of the present disclosure.

The image capture device 514 records a still image of the scene at a particular instance in time and/or a series of the images over a duration in time that represents the scene in motion. The image capture device 514 can record the scene using any suitable portion, or portions, of the electromagnetic spectrum that will be apparent to those skilled in the relevant art(s) without departing from the spirit and scope of the present disclosure. Although only one image capture device 514 is shown in FIG. 5, more than one image capture device 514 can be present within the media recording device 506.1 to record the scene in three dimensions (3D).

The processing module 508 processes the images, video, and/or audio from the one or more media recording devices 506.1 through 506.i to provide the multimedia data stream to a remote processing system, such as the remote media processing system 104 and/or the remote media processing system 202 to provide some examples, and/or to a local processing system, such as the local media processing system 404 to provide an example.

The stationary mount 504 is coupled to the media recording module 502. Typically, the stationary mount 504 is placed within the scene to allow the media recording module 502 to record images, video, and/or audio representing the scene in its field of view. The stationary mount 504 represents a stationary or fixed mount to stabilize the media recording module 502. As shown in FIG. 5, the stationary mount 504 can telescope to adjust a field of view of the media recording module 502. The stationary mount 504 as shown in FIG. 5 is for illustrative purposes only; those skilled in the relevant art(s) will recognize that other stationary or fixed mounts can be used without departing from the spirit and scope of the present disclosure.

FIG. 6 illustrates a block diagram of a second media capture module that can be used within the exemplary video camera system according to an exemplary embodiment of the present disclosure. A media capture module 600 represents a mobile media capture module for recording and/or storing images, video, and/or audio representing a scene in its field of view into a multimedia data stream. The media capture module 600 includes media recording devices 602.1 through 602.i and a processing module 604. The media recording devices 602.1 through 602.i and the processing module 604 operate in a substantially similar manner as the media recording devices 506.1 through 506.i and the processing module 508, respectively, and will not be described in further detail. The media capture module 600 can represent an exemplary embodiment of the media capture module 102, the media capture module 261, and/or the exemplary media system 400.

CONCLUSION

It is to be appreciated that the Detailed Description section, and not the Abstract section, is intended to be used to interpret the claims. The Abstract section can set forth one or more, but not all, exemplary embodiments of the disclosure and thus is not intended to limit the disclosure and the appended claims in any way.

The disclosure has been described above with the aid of functional building blocks illustrating the implementation of specified functions and relationships thereof. The boundaries of these functional building blocks have been arbitrarily defined herein for the convenience of the description. Alternate boundaries can be defined so long as the specified functions and relationships thereof are appropriately performed.

It will be apparent to those skilled in the relevant art(s) that various changes in form and detail can be made therein without departing from the spirit and scope of the disclosure. Thus, the disclosure should not be limited by any of the above-described exemplary embodiments, but should be defined only in accordance with the following claims and their equivalents.

What is claimed is:
 1. A media capture system supporting a user, the media capture system comprising: a media capture module configured to operate over a series of independent scenes to capture a plurality of independent multimedia data streams; a media processing system configured to recognize a portion of each of the plurality of independent multimedia data streams as being a particular object that is associated with a previously recognized object; and a user input interface configured to receive user information to direct the media processing system, wherein the media processing system is further configured to select, to extract, and to merge the particular object from among the portion of each of the plurality of independent multimedia data streams in response to the user information to provide a processed multimedia data stream.
 2. The media capture system of claim 1, wherein the media capture module is further configured to provide the user information to the media processing system to direct processing of the plurality of independent multimedia data streams.
 3. The media capture system of claim 2, wherein the user information comprises at least one of: commands to cause the media processing system to recognize the particular object from the plurality of independent multimedia data streams; commands to identify the particular object for which the media processing system is to recognize from the plurality of independent multimedia data streams; or commands to identify processing parameters of the processing to be performed by the media processing system, the processing parameters including: a preferred output type of the processed multimedia data stream, and a length of the processed multimedia data stream.
 4. The media capture system of claim 1, wherein the media processing system is further configured to assist in identifying common objects between the plurality of independent multimedia data streams, and wherein the user input interface is further configured to receive the user information regarding the common media elements.
 5. The media capture system of claim 1, wherein the media processing system is further configured to identify an object within the plurality of independent multimedia data streams and to recognize the object as being the particular object.
 6. A media capture device supporting a user, the media capture device comprising: a media capture module configured to operate over a series of independent scenes to capture a plurality of independent multimedia data streams; a media processing system configured to receive user information relating to a particular object that is within a portion of at least one of the plurality of independent multimedia data streams and to select, to extract, and to merge the particular object from among the portion of each of the plurality of independent multimedia data streams in response to the user information to provide a processed multimedia data stream; and a storage interface configured to support storage of the plurality of independent media data and a plurality of previously recognized objects.
 7. The media capture device of claim 6, wherein the media processing system is further configured to identify an object within the plurality of independent multimedia data streams in response to the user information and to identify the object as being one of the plurality of previously recognized objects.
 8. The media capture device of claim 7, wherein the media processing system is further configured to recognize the object as being the particular object by matching the object with one of the plurality of previously recognized objects.
 9. The media capture device of claim 6, wherein the media processing system is further configured to request assistance from the media capture module when the media processing system cannot match the object with one of the plurality of previously recognized objects.
 10. The media capture device of claim 6, further comprising: a user input interface configured to receive the supplemental user information.
 11. The media capture device of claim 6, further comprising: a cloud storage system configured to store the plurality of independent media data and the plurality of previously recognized objects.
 12. A media processing system that interacts with a storage to support a user, the storage containing a plurality of independent media data streams, each of the plurality of independent media data streams being captured over a series of independent scenes, the media processing system comprising: a service processing module configured to identify an object that is associated with the plurality of independent media data streams and to recognize the object as being a particular object that is associated with a previously recognized object from among a plurality of previously recognized objects; and a support processing module configured to extract and to merge the particular object from a portion of at least one of the plurality of independent multimedia data streams to provide a processed multimedia data stream.
 13. The media processing system of claim 12, wherein the service processing module is further configured to identify the object within the plurality of independent multimedia data streams in response to user information.
 14. The media capture device of claim 13, wherein the service processing module is further configured to recognize the object as being the particular object by matching the object with one of the plurality of previously recognized objects.
 15. The media capture device of claim 13, wherein the service processing module is further configured to request assistance from a user when the first service cannot match the object with one of the plurality of previously recognized objects.
 16. The media capture device of claim 15, wherein the service processing module is further configured to receive a pointer from the user to identify the object as being a new recognized object that has not been previously recognized.
 17. The media capture device of claim 12, wherein the support processing module is further configured to extract and to merge the particular object from the portion of the at least one of the plurality of independent multimedia data streams to compile a new multimedia data stream.
 18. The media capture device of claim 12, further comprising: an imager and microphone module configured to record a scene in its field of view from among the series of independent scenes as a media data stream from among the plurality of independent media data streams.
 19. The media capture device of claim 18, wherein the service processing module is communicatively coupled to the imager and microphone module via a communications network.
 20. The media capture device of claim 18, wherein at least two of the service processing module, the support processing module, and the imager and microphone module are implemented as an integrated electronic device. 